Toward a model for lexical access based on acoustic landmarks and distinctive features.
Abstract
This article describes a model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features. These distinctive features specify the phonemic contrasts that are used in the language, such that a change in the value of a feature can potentially generate a new word. This model is a part of a more general model that derives a word sequence from this feature representation, the words being represented in a lexicon by sequences of feature bundles. The processing of the signal proceeds in three steps: (1) Detection of peaks, valleys, and discontinuities in particular frequency ranges of the signal leads to identification of acoustic landmarks. The type of landmark provides evidence for a subset of distinctive features called articulator-free features (e.g., [vowel], [consonant], [continuant]). (2) Acoustic parameters are derived from the signal near the landmarks to provide evidence for the actions of particular articulators, and acoustic cues are extracted by sampling selected attributes of these parameters in these regions. The selection of cues that are extracted depends on the type of landmark and on the environment in which it occurs. (3) The cues obtained in step (2) are combined, taking context into account, to provide estimates of "articulator-bound" features associated with each landmark (e.g., [lips], [high], [nasal]). These articulator-bound features, combined with the articulator-free features in (1), constitute the sequence of feature bundles that forms the output of the model. 
Examples of cues that are used, and justification for this selection, are given, as well as examples of the process of inferring the underlying features for a segment when there is variability in the signal due to enhancement gestures (recruited by a speaker to make a contrast more salient) or due to overlap of gestures from neighboring segments.
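As an illustration of step (1) only, the following is a minimal sketch, not the article's implementation. It assumes a hypothetical input `energy_db` (the per-frame energy, in dB, of one frequency band) and marks abrupt rises and falls as consonant-type landmarks and energy peaks as vowel-type landmarks. The names `Landmark`, `ARTICULATOR_FREE`, and `detect_landmarks`, and the thresholds `jump_db` and `min_peak_db`, are assumptions made for this example; the abstract does not specify any of them.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from landmark type to the articulator-free
# feature it provides evidence for (illustrative, not from the article).
ARTICULATOR_FREE = {
    "peak": {"vowel": True},
    "abrupt_rise": {"consonant": True},
    "abrupt_fall": {"consonant": True},
}


@dataclass
class Landmark:
    frame: int                      # frame index where the landmark is placed
    kind: str                       # "peak", "abrupt_rise", or "abrupt_fall"
    features: dict = field(default_factory=dict)


def detect_landmarks(energy_db, jump_db=9.0, min_peak_db=50.0, lag=1):
    """Find discontinuities and peaks in a single band-energy contour.

    jump_db, min_peak_db, and lag are assumed values chosen for the toy
    example below, not parameters taken from the article.
    """
    n = len(energy_db)
    found = []

    # Discontinuities: a change over `lag` frames that exceeds jump_db
    # and is locally the largest change.
    deltas = [energy_db[i + lag] - energy_db[i] for i in range(n - lag)]
    for i, d in enumerate(deltas):
        if abs(d) < jump_db:
            continue
        left = abs(deltas[i - 1]) if i > 0 else 0.0
        right = abs(deltas[i + 1]) if i + 1 < len(deltas) else 0.0
        if abs(d) > left and abs(d) >= right:
            kind = "abrupt_rise" if d > 0 else "abrupt_fall"
            found.append(Landmark(i + lag, kind, dict(ARTICULATOR_FREE[kind])))

    # Peaks: local maxima of the contour that lie above min_peak_db.
    for i in range(1, n - 1):
        is_local_max = energy_db[i] > energy_db[i - 1] and energy_db[i] >= energy_db[i + 1]
        if is_local_max and energy_db[i] >= min_peak_db:
            found.append(Landmark(i, "peak", dict(ARTICULATOR_FREE["peak"])))

    return sorted(found, key=lambda lm: lm.frame)


# Toy contour: silence, an abrupt onset, a vowel-like peak, an abrupt offset.
contour = [40, 40, 40, 60, 64, 66, 64, 60, 40, 40, 40]
for lm in detect_landmarks(contour):
    print(lm.frame, lm.kind, lm.features)
```

In the model described above, each landmark would then determine which acoustic cues are sampled nearby in step (2). This sketch stops at the landmark type and the articulator-free feature it suggests.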
Related resources
Modeling the Phonemic Recognition of Persian Words
A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...
A Landmark-based Model of Speech Perception: History and Recent Developments
This paper traces some of the history of the development of a model for speech perception in which words are assumed to be represented as sequences of bundles of binary distinctive features. In the model, probability estimates for feature values are derived from measurements of acoustic attributes in the vicinity of acoustic “landmarks.” Landmarks are detected based on amplitude changes in vari...
Nasal detection module for a knowledge-based speech recognition system
The Lexical Access From Features (LAFF) project tries to model the representation and perception of speech by human listeners. The derivation of such a representation involves first finding certain acoustic landmarks. Based on the landmarks and the acoustic cues surrounding the landmarks, distinctive features of the speech segments may be deciphered. The present study concentrates on the nasali...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Acoustic Cues, Landmarks, and Distinctive Features: a Model of Human Speech Processing
Four aspects of human speech processing are discussed along with their impact on the fundamental structure of a model of the human lexical access process (Stevens, 2002): (1) the lexical representation, (2) sensitivity observed in auditory processing, (3) multiple and graded activations of lexical candidates, and (4) contextual variation. The model assumes that the lexicon is represented in ter...
Journal title:
- The Journal of the Acoustical Society of America
Volume 111, Issue 4
Pages: -
Publication date: 2002